370 research outputs found

    On Collaborative Predictive Blacklisting

    Collaborative predictive blacklisting (CPB) makes it possible to forecast future attack sources based on logs and alerts contributed by multiple organizations. However, research on CPB has focused only on increasing the number of predicted attacks and has not considered the impact on false positives and false negatives. Moreover, sharing alerts is often hindered by confidentiality, trust, and liability issues, which motivates the need for privacy-preserving approaches to the problem. In this paper, we present a measurement study of state-of-the-art CPB techniques, aiming to shed light on the actual impact of collaboration. To this end, we reproduce and measure two systems: a non-privacy-friendly one that uses a trusted coordinating party with access to all alerts (Soldo et al., 2010) and a peer-to-peer one using privacy-preserving data sharing (Freudiger et al., 2015). We show that, while collaboration boosts the number of predicted attacks, it also yields high false positives, ultimately leading to poor accuracy. This motivates us to present a hybrid approach, using a semi-trusted central entity, aiming to increase utility from collaboration while limiting information disclosure and false positives. This leads to a better trade-off of true and false positive rates, while also addressing privacy concerns.
    Comment: A preliminary version of this paper appears in ACM SIGCOMM's Computer Communication Review (Volume 48 Issue 5, October 2018). This is the full versio
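
The trade-off between true and false positive rates discussed in this abstract can be illustrated with a small sketch. It is not the paper's evaluation code: the function name `blacklist_rates` and the idea of a fixed `candidates` set (all sources seen in the evaluation window) are assumptions made here so that a false positive rate is well defined.

```python
def blacklist_rates(predicted, attackers, candidates):
    """Return (true positive rate, false positive rate) of a predicted blacklist.

    predicted:  sources placed on the blacklist
    attackers:  sources that actually attacked in the evaluation window
    candidates: all sources considered (assumed universe, hypothetical)
    """
    predicted, attackers, candidates = set(predicted), set(attackers), set(candidates)
    benign = candidates - attackers
    true_pos = len(predicted & attackers)   # blacklisted and really attacked
    false_pos = len(predicted & benign)     # blacklisted but never attacked
    tpr = true_pos / len(attackers) if attackers else 0.0
    fpr = false_pos / len(benign) if benign else 0.0
    return tpr, fpr


# A larger blacklist can catch more attackers while also flagging more benign sources.
tpr, fpr = blacklist_rates({"a", "b", "c"}, {"a", "d"}, {"a", "b", "c", "d", "e", "f"})
print(tpr, fpr)
```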

    Differentially Private Mixture of Generative Neural Networks

    Generative models are used in a wide range of applications building on large amounts of contextually rich information. Due to possible privacy violations of the individuals whose data is used to train these models, however, publishing or sharing generative models is not always viable. In this paper, we present a novel technique for privately releasing generative models and entire high-dimensional datasets produced by these models. We model the generator distribution of the training data with a mixture of k generative neural networks. These are trained together and collectively learn the generator distribution of a dataset. Data is divided into k clusters using a novel differentially private kernel k-means; each cluster is then given to a separate generative neural network, such as a Restricted Boltzmann Machine or a Variational Autoencoder, which is trained only on its own cluster using differentially private gradient descent. We evaluate our approach using the MNIST dataset, as well as call detail records and transit datasets, showing that it produces realistic synthetic samples, which can also be used to accurately compute an arbitrary number of counting queries.
    Comment: A shorter version of this paper appeared at the 17th IEEE International Conference on Data Mining (ICDM 2017). This is the full version, published in IEEE Transactions on Knowledge and Data Engineering (TKDE
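
The abstract does not spell out its differentially private gradient descent. As a rough illustration only, the sketch below shows the commonly used clip-and-add-Gaussian-noise step. The names `clip`, `private_gradient`, `max_norm`, and `noise_multiplier` are hypothetical, and privacy accounting (turning the noise level into an epsilon guarantee) is deliberately left out.

```python
import math
import random


def clip(grad, max_norm):
    """Scale a gradient vector so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def private_gradient(per_example_grads, max_norm, noise_multiplier, rng):
    """Average per-example gradients after clipping and adding Gaussian noise.

    The noise standard deviation is noise_multiplier * max_norm, applied to
    the sum of clipped gradients before averaging (assumed recipe).
    """
    clipped = [clip(g, max_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    noisy = [s + rng.gauss(0.0, noise_multiplier * max_norm) for s in summed]
    return [v / len(clipped) for v in noisy]


rng = random.Random(0)
print(private_gradient([[3.0, 4.0], [0.1, 0.0]], max_norm=1.0, noise_multiplier=1.0, rng=rng))
```

Clipping bounds how much any single record can move the update, which is what lets the added noise mask an individual's contribution.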

    ReMasker: Imputing Tabular Data with Masked Autoencoding

    We present ReMasker, a new method of imputing missing values in tabular data by extending the masked autoencoding framework. Compared with prior work, ReMasker is both simple (besides the missing values, which are naturally masked, we randomly "re-mask" another set of values, optimize the autoencoder by reconstructing this re-masked set, and apply the trained model to predict the missing values) and effective (with extensive evaluation on benchmark datasets, we show that ReMasker performs on par with or outperforms state-of-the-art methods in terms of both imputation fidelity and utility under various missingness settings, while its performance advantage often increases with the ratio of missing data). We further explore theoretical justification for its effectiveness, showing that ReMasker tends to learn missingness-invariant representations of tabular data. Our findings indicate that masked modeling represents a promising direction for further research on tabular data imputation. The code is publicly available
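
The re-masking step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' released code: `remask`, `reconstruction_loss`, the `ratio` parameter, and the use of mean squared error are all hypothetical choices, and the autoencoder itself is omitted.

```python
import random


def remask(observed, ratio, seed=0):
    """Split observed cells into a visible set and a randomly re-masked set.

    observed: 2-D list of booleans, True where a value is present.
    ratio:    probability of hiding each observed cell (hypothetical knob).
    Cells that are already missing are never placed in the re-masked set.
    """
    rng = random.Random(seed)
    visible, remasked = [], []
    for row in observed:
        vis_row, re_row = [], []
        for present in row:
            hide = present and rng.random() < ratio
            re_row.append(hide)
            vis_row.append(present and not hide)
        visible.append(vis_row)
        remasked.append(re_row)
    return visible, remasked


def reconstruction_loss(x_true, x_pred, remasked):
    """Mean squared error computed over the re-masked cells only."""
    errors = [
        (p - t) ** 2
        for t_row, p_row, m_row in zip(x_true, x_pred, remasked)
        for t, p, m in zip(t_row, p_row, m_row)
        if m
    ]
    return sum(errors) / len(errors) if errors else 0.0
```

During training, a model would see only the `visible` cells and be scored on the `remasked` ones, whose true values are known. At inference time the same model is asked to fill in the genuinely missing cells.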

    Building and evaluating privacy-preserving data processing systems

    Large-scale data processing prompts a number of important challenges, including guaranteeing that collected or published data is not misused, preventing disclosure of sensitive information, and deploying privacy protection frameworks that support usable and scalable services. In this dissertation, we study and build systems geared for privacy-friendly data processing, enabling computational scenarios and applications where potentially sensitive data can be used to extract useful knowledge, and which would otherwise be impossible without such strong privacy guarantees. For instance, we show how to privately and efficiently aggregate data from many sources and large streams, and how to use the aggregates to extract useful statistics and train simple machine learning models. We also present a novel technique for privately releasing generative machine learning models and entire high-dimensional datasets produced by these models. Finally, we demonstrate that the data used by participants in training generative and collaborative learning models may be vulnerable to inference attacks and discuss possible mitigation strategies

    Novel homogeneous selective electrocatalysts for CO2 reduction: an electrochemical and computational study of cyclopentadienyl-phenylendiamino-cobalt complexes

    Four cyclopentadienyl-phenylendiamino-cobalt complexes [CoCp(bqdi)] with different substituents (R) at the phenylene moiety (bqdi, I; o-perfluoro-bqdi, II; p-NO2-bqdi, III; p-COOH-bqdi, IV) have been studied to investigate their capability as catalysts for CO2 reduction. These compounds were characterized by cyclic voltammetry measurements under both nitrogen and CO2 atmospheres, showing an increase in the cathodic current ranging from 3.36 (III) to 5.59 times (II) that of the measurement under nitrogen. Moreover, with the addition of water, the current enhancement in the presence of CO2 reaches 31.07 times for complex II. Interestingly, these complexes exhibit very good selectivity toward CO2 reduction irrespective of hydrogen even in the presence of water. The relative turnover frequencies were also estimated, giving values ranging from 3.23 (III) to 187.21 s−1 (II) in the presence of water. In addition, these results were analysed by means of density functional theory (DFT) calculations and Fukui function analysis. In particular, the DFT results clearly show the effects of the different substituents on the electrochemical properties of these compounds, whereas the Fukui function analysis indicates that the most favourable positions for an electrophilic attack on the reduced complex are the nitrogen and cobalt atoms

    On the Use of Tri-Stereo Pleiades Images for the Morphometric Measurement of Dolines in the Basaltic Plateau of Azrou (Middle Atlas, Morocco)

    Hundreds of large and deep collapse dolines dot the surface of the Quaternary basaltic plateau of Azrou, in the Middle Atlas of Morocco. In the absence of detailed topographic maps, the morphometric study of such a large number of features requires the use of remote sensing techniques. We present the processing, extraction, and validation of depth measurements of 89 dolines using tri-stereo Pleiades images acquired in 2018–2019 (the European Space Agency (ESA) © CNES 2018, distributed by Airbus DS). Satellite image-derived DEMs were field-verified using traditional mapping techniques, which showed very good agreement between field and remote sensing measurements. The high resolution of these tri-stereo images allowed the automatic generation of accurate morphometric datasets covering not only the planimetric parameters of the dolines (diameters, contours, orientation of long axes), but also their depth and altimetric profiles. Our study demonstrates the potential of using these types of images on rugged morphologies and for the measurement of steep depressions, where traditional remote sensing techniques may be hindered by shadow zones and blind portions. Tri-stereo images might also be suitable for the measurement of deep and steep depressions (skylights and collapses) on Martian and Lunar lava flows, suitable targets for future planetary cave exploration

    Sulphur vs NH Group: Effects on the CO2 Electroreduction Capability of Phenylenediamine-Cp Cobalt Complexes

    The cobalt complex (I) with cyclopentadienyl and 2-aminothiophenolate ligands was investigated as a homogeneous catalyst for electrochemical CO2 reduction. By comparing its behavior with that of an analogous complex with phenylenediamine (II), the effect of the sulfur atom as a substituent has been evaluated. As a result, a positive shift of the reduction potential and the reversibility of the corresponding redox process have been observed, also suggesting a higher stability of the compound with sulfur. Under anhydrous conditions, complex I showed a higher current enhancement in the presence of CO2 (9.41) in comparison with II (4.12). Moreover, the presence of only one -NH group in I explained the difference in the observed increases in catalytic activity toward CO2 due to the presence of water, with current enhancements of 22.73 and 24.40 for I and II, respectively. DFT calculations confirmed the effect of sulfur on the lowering of the energy of the frontier orbitals of I, highlighted by electrochemical measurements. Furthermore, the condensed Fukui function f⁻ values agreed very well with the current enhancement observed in the absence of water

    Moving towards happiness? Understanding travel moods through twitter data in Turin

    The paper will address the following questions: does urban mobility matter for health, and mental health in particular? How does each transport mode relate to our level of stress/happiness? A previous study conducted on Turin (Melis et al. 2015) showed that among indicators related to urban structure and social composition, ‘accessibility by public transport’ seems to be the one with the strongest relation with mental health (depression) outcomes. Starting from these results, we decided to further explore this association through the use of data from social media. Recent trends in the use of social networks have opened up new opportunities in the field of urban and transport studies: the great amount of data coming from Twitter is an example, providing easily available, often geo-referenced, marginally costly datasets offering new insights on individual and collective life. The accuracy and reliability, as well as the representativeness, of the results coming from the use of this new source of data in the mobility and planning field are undoubtedly growing. The project uses Twitter data collected for the metropolitan area of Turin (IT) and analyses it using a Semantic Analysis algorithm to show spatiotemporal levels of happiness (valence) of users, related to the transport mode they have been using. Geographic Information Systems (GIS) and spatial analysis techniques are then used to visualize spatial patterns and associations among happiness levels and contextual variables, such as land use. From a methodological point of view, results can be compared to research conducted on US cities by Flint University (Rybarczyk and Banerjee 2015), as the method used is the same. The purpose of the study is exploratory, in order to understand what use can be made of such a rich data source as social media information. Therefore, the results may be used to promote the use of social media data by transportation planners and public health officials for developing more effective transportation plans and policies, as well as to understand the degree of satisfaction/stress linked to different transport modes

    Rheology of Conductive High Reactivity Carbonaceous Material (HRCM)-Based Ink Suspensions: Dependence on Concentration and Temperature

    The present case study reports a shear rheological characterization in the temperature domain of inks and pastes loaded with conductive High Reactivity Carbonaceous Material (HRCM) consisting mainly of few-layer graphene sheets. The combined effect of filler concentration and applied shear rate is investigated in terms of the shear viscosity response as a function of testing temperature. The non-Newtonian features of shear flow ramps at constant temperature are reported to depend on both the HRCM load and the testing temperature. Moreover, temperature ramps at a constant shear rate reveal a different viscosity-temperature dependence from what is observed in shear flow ramps while maintaining the same filler concentration. An apparent departure from the well-known Vogel-Fulcher-Tammann relationship as a function of the applied shear rate is also reported
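
For readers unfamiliar with the Vogel-Fulcher-Tammann relationship named in this abstract, its usual form is eta(T) = eta_inf * exp(B / (T - T0)) for T > T0. The sketch below simply evaluates that expression. The parameter names and the numeric values in the usage line are illustrative assumptions, not values fitted to the HRCM inks.

```python
import math


def vft_viscosity(temperature, eta_inf, b, t0):
    """Viscosity from the Vogel-Fulcher-Tammann form.

    eta(T) = eta_inf * exp(b / (T - t0)), defined only for T > t0.
    temperature and t0 share one unit (e.g. kelvin); b has the same unit.
    """
    if temperature <= t0:
        raise ValueError("VFT is only defined for temperature > t0")
    return eta_inf * math.exp(b / (temperature - t0))


# Illustrative parameters only: viscosity falls as temperature rises.
for t in (300.0, 320.0, 340.0):
    print(t, vft_viscosity(t, eta_inf=1e-3, b=800.0, t0=200.0))
```

One way to look for a departure from this law would be to fit `eta_inf`, `b`, and `t0` separately at each shear rate and compare the residuals, though the abstract does not say how the authors assessed it.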